Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use sqlite3_column_type() for ColumnTypeScanType() #1327

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

alkemir
Copy link

@alkemir alkemir commented Mar 17, 2025

The current implementation was changed in PR-909 stating that sqlite3_column_type() always returns SQLITE_NULL, but as noted in ISSUE-600, the case was probably because for SQLite3, sqlite3_column_type() must be called after calling Next(). sqlite3_column_type() returns the types for the columns on a given row, thus the cursor must be pointing to a valid row. See SQLite docs on API lifecycle, rows and columns methods and this forum post

The Go API does not restrain the behavior of RowsColumnTypeScanType() as to when it should be called. Thus, the implementation in this PR makes a compromise and falls back to sqlite3_column_decltype() in a best effort to remain retro-compatible, but this is still a potentially breaking change.

The advantage of using sqlite3_column_type() over sqlite3_column_decltype() is two-fold:

  1. It is closer to the SQLite3 API
  2. It will return the correct scanType for expressions and aggregate functions, which are the cases in which sqlite3_column_decltype() is insufficient. Regarding this aspect, the existing implementation provides no way to access the values without potentially triggering a type conversion, unless the application is aware of the result type of the query beforehand.

sqlite3_type.go Outdated
return TYPE_NULLSTRING
case C.SQLITE_BLOB:
return TYPE_RAWBYTES
//case C.SQLITE_NULL:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this commented out?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it doesn't apply, then just remove the code?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its to document that the omission is intentional and not a miss, this is the actual value that we would be getting in 2 cases:

  1. If we call before we have a valid row
  2. If the value for the row-column is NULL

Happy to remove it or add a comment clarifying.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like a mistake. An explicit comment is better.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added comment. Please take another look.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

SQLITE_NULL
)

var (
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are these all public?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just had my Java brain on. Making them private.

sqlite3_type.go Outdated
TYPE_RAWBYTES = reflect.TypeOf(sql.RawBytes{})
TYPE_NULLBOOL = reflect.TypeOf(sql.NullBool{})
TYPE_NULLTIME = reflect.TypeOf(sql.NullTime{})
TYPE_ANY = reflect.TypeOf(new(any))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well this is unfortunate - this means we are saying *any instead of any. I know that's the existing behavior, but does that even work?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

*any is the expected type, see the Rows.Scan() docs:

// If an argument has type *interface{}, Scan copies the value
// provided by the underlying driver without conversion. When scanning
// from a source value of type []byte to *interface{}, a copy of the
// slice is made and the caller owns the result.

Relevant code:

case nil:
    switch d := dest.(type) {
    case *any:
	if d == nil {
		return errNilPtr
	}
	*d = nil
	return nil

In our case, this works:

	for rr.Next() {
		cc, err := rr.ColumnTypes()
		if err != nil {
			log.Fatal(err)
		}

		pointers := make([]any, 0)
		for _, ct := range cc {
			fmt.Printf("%s ", ct.ScanType())
			pointers = append(pointers, reflect.New(ct.ScanType()).Interface())
		}
		fmt.Println()

		if err := rr.Scan(pointers...); err != nil {
			log.Fatal(err)
		}

		fmt.Println(pointers...)
		for _, p := range pointers {
			if pn, ok := p.(**any); ok {
				fmt.Printf("%t ", *pn == nil)
			}
		}

	}

sql.NullInt64 sql.NullString sql.NullFloat64
&{1 true} &{hello true} &{3.1415 true}
sql.NullInt64 *interface {} *interface {}
&{2 true} 0xc000064090 0xc000064098
true true

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's unrelated. Rows.Scan is talking about its arguments, which by necessity all have to be pointers as they are essentially output parameter. Meanwhile, ColumnTypeScanType is talking about the type to scan into (so not a pointer).

For example, ColumnTypeScanType mentions returning reflect.TypeOf(int64(0)) while Rows.Scan mentions *int64.

You'll also notice that all the other types being returned (e.g., TYPE_NULLSTRING) here are not pointers.

In other words, you are supposed to create a variable of the type from ColumnTypeScanType, and then pass a pointer to it to Rows.Scan.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. I will return interface{} now, following Go 1.21 , although this has been changed for Go 1.24 for compatibility.

Still, this might break code depending on the previous behavior.

func (rc *SQLiteRows) ColumnTypeScanType(i int) reflect.Type {
//ct := C.sqlite3_column_type(rc.s.s, C.int(i)) // Always returns 5
switch C.sqlite3_column_type(rc.s.s, C.int(i)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the SQL statement does not currently point to a valid row, or if the column index is out of range, the result is undefined.

We cannot call this if we haven't called Next yet, or if we have iterated past the last row, even if it happens to work today.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if you mean

  1. SQLiteRows.ColumnTypeScanType() should make sure that we have a valid row so it does not run into undefined behavior, in which case it should signal an error condition, or
  2. The driver should not call C.sqlite3_column_type() except inside Next(), which is the only point at which we can assure that we have a valid row.

If it is the first, the method signature does not allow a clear way of signaling errors (except for the empty interface, which signals not-supported) and I can't find a way to know if we have a valid row. The design of the packages (sql and go-sqlite3) seems to imply this should not be a concern beyond Next() and Scan(). Also the docs do not prescribe a context in which it is ok to call this method, but they do for other methods like Close() and Scan() implying it should always be ok to call SQLiteRows.ColumnTypeScanType(). The risk here is that the undefined behavior might mask as a type that is actually invalid for scanning. But, the implication is that the calling code would then try to Scan(), at which point it would get an error, so I see the risk as mitigated.

If it is the second, it is troublesome because that means there is no way of getting the rows without risking a type conversion unless the types are known by the calling code.

Copy link
Collaborator

@rittneje rittneje Mar 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a method on SQLiteRows, which certainly has the ability to know whether Next has been called yet, and whether the last call returned io.EOF. See sqlite3_stmt_busy.

Also, "undefined behavior" means that anything could happen, including an application crash, now or in the future.

@rittneje
Copy link
Collaborator

Please add a unit test for the new behavior.

@alkemir
Copy link
Author

alkemir commented Mar 24, 2025

Should we still return the nullable types? When SQLite returns a NULL value, the column type will be type_any, so there is no need for nullable types now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants